vm-import: fix unmanaged instance listing#5400

Merged: sureshanaparti merged 3 commits into apache:4.16 from shapeblue:fix-unmanagedinstances-listing on Feb 3, 2022

@shwstppr
Contributor

@shwstppr shwstppr commented Sep 1, 2021

Description

When neither the host ID nor the last host ID is set for a VM, it may wrongly appear in the list of unmanaged instances.
This change fixes the behaviour by filtering the unmanaged instances list for a host using the following three criteria:

  • host is set as the host_id for the VM
  • host is set as the last_host_id for the VM
  • pod of the host is set as the pod_id for the VM and both host_id and last_host_id are NULL
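
The three criteria above can be sketched in a few lines of Python. This is only an illustration of the filtering rule as described here, not the actual CloudStack (Java) implementation; `ManagedVm`, `is_managed_on_host` and `unmanaged_names` are hypothetical names, and the IDs used below are made-up values.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class ManagedVm:
    """Hypothetical, simplified view of a managed VM record."""
    instance_name: str
    host_id: Optional[int]
    last_host_id: Optional[int]
    pod_id: Optional[int]


def is_managed_on_host(vm: ManagedVm, host_id: int, host_pod_id: int) -> bool:
    """Return True if the VM matches any of the three criteria for this host."""
    if vm.host_id == host_id:
        return True
    if vm.last_host_id == host_id:
        return True
    # Fallback: no host information at all, so match on the pod instead.
    return (
        vm.host_id is None
        and vm.last_host_id is None
        and vm.pod_id == host_pod_id
    )


def unmanaged_names(
    hypervisor_vm_names: Iterable[str],
    managed_vms: Iterable[ManagedVm],
    host_id: int,
    host_pod_id: int,
) -> List[str]:
    """Drop every hypervisor VM whose name belongs to a managed VM of this host."""
    managed = {
        vm.instance_name
        for vm in managed_vms
        if is_managed_on_host(vm, host_id, host_pod_id)
    }
    return [name for name in hypervisor_vm_names if name not in managed]
```

For example, assuming a stopped managed VM `ManagedVm("i-2-5-VM", None, None, 1)` and a host with ID 10 in pod 1, `unmanaged_names(["test123", "i-2-5-VM"], [that VM], 10, 1)` returns `["test123"]`. Without the third criterion the same VM would slip through and be reported as unmanaged.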

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • Major
  • Minor

Bug Severity

  • BLOCKER
  • Critical
  • Major
  • Minor
  • Trivial

Screenshots (if appropriate):

How Has This Been Tested?

Clusters:

(localcloud) SBCM5> > list clusters filter=id,name,podid,podname
{
  "cluster": [
    {
      "id": "3d79b022-0fd9-4f51-805d-ef71bef562c5",
      "name": "p1-c1",
      "podid": "c9e2eccc-9c94-4a9e-a971-fa15a0bf59c2",
      "podname": "Pod1"
    },
    {
      "id": "a5b2a7b8-f897-4d5f-889f-a9f1321e7c2b",
      "name": "10.0.35.234/Trillian/p1-c2",
      "podid": "c9e2eccc-9c94-4a9e-a971-fa15a0bf59c2",
      "podname": "Pod1"
    }
  ],
  "count": 2
}

The zone has two VMs: one managed VM (t1) and one unmanaged VM (test123). In the tests below, without the PR changes, ACS wrongly lists both VMs as unmanaged. With the changes, ACS correctly lists only test123 as the unmanaged VM.

VMs list in the environment after inter-cluster migration of VM - t1:

(localcloud) SBCM5> > list virtualmachines filter=id,name
{
  "count": 1,
  "virtualmachine": [
    {
      "id": "7ce5dab4-ddad-4f25-85f1-8e3992ccb0a0",
      "name": "t1"
    }
  ]
}
(localcloud) SBCM5> > list volumes virtualmachineid=7ce5dab4-ddad-4f25-85f1-8e3992ccb0a0 
{
  "count": 1,
  "volume": [
    {
      "account": "admin",
      "chaininfo": "{\"diskDeviceBusName\":\"ide0:1\",\"diskChain\":[\"[4f37137963f1317e9709563c8f57983e] i-2-5-VM/i-2-5-VM.vmdk\"]}",
      "clusterid": "3d79b022-0fd9-4f51-805d-ef71bef562c5",
      "clustername": "p1-c1",
      "created": "2021-09-06T10:17:07+0000",
      "destroyed": false,
      "deviceid": 0,
      "diskioread": 27,
      "diskiowrite": 9,
      "diskkbsread": 1155,
      "diskkbswrite": 74,
      "displayvolume": true,
      "domain": "ROOT",
      "domainid": "09046ad8-0bda-11ec-a29c-1e0094000118",
      "hypervisor": "VMware",
      "id": "c1340b3e-37dc-4959-af84-e1b5e7395efe",
      "isextractable": true,
      "name": "ROOT-5",
      "path": "i-2-5-VM",
      "podid": "c9e2eccc-9c94-4a9e-a971-fa15a0bf59c2",
      "podname": "Pod1",
      "provisioningtype": "thin",
      "quiescevm": false,
      "serviceofferingdisplaytext": "Small Instance",
      "serviceofferingid": "0820dfc7-d843-4120-9df6-ee4e216083f5",
      "serviceofferingname": "Small Instance",
      "size": 2147483648,
      "state": "Ready",
      "storage": "ps1",
      "storageid": "4f371379-63f1-317e-9709-563c8f57983e",
      "storagetype": "shared",
      "supportsstoragesnapshot": false,
      "tags": [],
      "templatedisplaytext": "CentOS 5.3(64-bit) no GUI (vSphere)",
      "templateid": "0907e0f0-0bda-11ec-a29c-1e0094000118",
      "templatename": "CentOS 5.3(64-bit) no GUI (vSphere)",
      "type": "ROOT",
      "virtualmachineid": "7ce5dab4-ddad-4f25-85f1-8e3992ccb0a0",
      "vmdisplayname": "t1",
      "vmname": "t1",
      "vmstate": "Stopped",
      "zoneid": "1991b455-cebf-4507-88c4-8c8a467971c3",
      "zonename": "pr4774-t1933-vmware-67u3"
    }
  ]
}

Unmanaged instances list before fix:

(localcloud) SBCM5> > list unmanagedinstances clusterid=3d79b022-0fd9-4f51-805d-ef71bef562c5 filter=name,hostid,hostname,clusterid
{
  "count": 2,
  "unmanagedinstance": [
    {
      "clusterid": "3d79b022-0fd9-4f51-805d-ef71bef562c5",
      "hostid": "4b7798ef-745c-4595-9715-be976dfbe963",
      "hostname": "10.0.34.154",
      "name": "test123"
    },
    {
      "clusterid": "3d79b022-0fd9-4f51-805d-ef71bef562c5",
      "hostid": "4b7798ef-745c-4595-9715-be976dfbe963",
      "hostname": "10.0.34.154",
      "name": "i-2-5-VM"
    }
  ]
}

Unmanaged instances list after fix:

(localcloud) SBCM5> > list unmanagedinstances clusterid=3d79b022-0fd9-4f51-805d-ef71bef562c5 filter=name,hostid,hostname,clusterid
{
  "count": 1,
  "unmanagedinstance": [
    {
      "clusterid": "3d79b022-0fd9-4f51-805d-ef71bef562c5",
      "hostid": "4b7798ef-745c-4595-9715-be976dfbe963",
      "hostname": "10.0.34.154",
      "name": "test123"
    }
  ]
}

@shwstppr
Contributor Author

shwstppr commented Sep 2, 2021

@blueorangutan package

@blueorangutan

@shwstppr a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan

Packaging result: ✔️ el7 ✔️ el8 ✔️ debian ✔️ suse15. SL-JID 1099

@yadvr
Member

yadvr commented Oct 3, 2021

@blueorangutan test centos7 vmware-67u3

@blueorangutan

@rhtyd a Trillian-Jenkins test job (centos7 mgmt + vmware-67u3) has been kicked to run smoke tests

@blueorangutan

Trillian test result (tid-2304)
Environment: vmware-67u3 (x2), Advanced Networking with Mgmt server 7
Total time taken: 57663 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr5400-t2304-vmware-67u3.zip
Smoke tests completed. 88 look OK, 5 have errors
Only failed tests results shown below:

| Test | Result | Time (s) | Test File |
| --- | --- | --- | --- |
| test_01_redundant_vpc_site2site_vpn | Failure | 515.86 | test_vpc_vpn.py |
| test_deploy_vm_start_failure | Error | 116.96 | test_deploy_vm.py |
| test_deploy_vm_volume_creation_failure | Error | 97.92 | test_deploy_vm.py |
| test_vm_ha | Error | 91.45 | test_vm_ha.py |
| test_01_unmanage_vm_cycle | Error | 45.18 | test_vm_life_cycle.py |
| test_vm_sync | Error | 230.95 | test_vm_sync.py |

@DaanHoogland
Contributor

@shwstppr Can you double check your test output? I see 't1' as a vm name in the initial state, and then 'test123' and 'i-2-5-VM' in the final report. I'm thinking 't1' and 'test123' are meant to be the same, but please confirm (is this output from two different test runs, or am I missing the point?)

@yadvr
Member

yadvr commented Oct 8, 2021

@blueorangutan package

@blueorangutan

@rhtyd a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan

Packaging result: ✖️ el7 ✖️ el8 ✖️ debian ✖️ suse15. SL-JID 1528

@yadvr yadvr changed the base branch from main to 4.16 November 15, 2021 10:09
@yadvr yadvr added this to the 4.16.1.0 milestone Nov 25, 2021
@sureshanaparti
Contributor

@shwstppr can you fix the conflicts

@shwstppr shwstppr force-pushed the fix-unmanagedinstances-listing branch from 34a939d to 527146a Compare January 28, 2022 08:59
When neither the host ID nor the last host ID is set for a VM, it may wrongly appear in the list of unmanaged instances.
This change fixes the behaviour by filtering the unmanaged instances list for a host using the following three criteria:
- host is set as the host_id for the VM
- host is set as the last_host_id for the VM
- pod of the host is set as the pod_id for the VM and both host_id and last_host_id are NULL

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
@shwstppr shwstppr force-pushed the fix-unmanagedinstances-listing branch from 527146a to 6f20a8e Compare January 28, 2022 09:10
@shwstppr shwstppr marked this pull request as ready for review January 28, 2022 09:11
@shwstppr
Contributor Author

@DaanHoogland The zone had two VMs: one managed VM (t1) and one unmanaged VM (test123). In the tests, without the PR changes, ACS wrongly lists both VMs as unmanaged; t1 shows up in that list under its internal name, i-2-5-VM. With the changes, ACS correctly lists only test123 as the unmanaged VM.

@blueorangutan package

@blueorangutan

@shwstppr a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan

Packaging result: ✔️ el7 ✔️ el8 ✔️ debian ✔️ suse15. SL-JID 2364

@sureshanaparti
Contributor

@blueorangutan test centos7 vmware-67u3

@blueorangutan

@sureshanaparti a Trillian-Jenkins test job (centos7 mgmt + vmware-67u3) has been kicked to run smoke tests

Contributor

@sureshanaparti sureshanaparti left a comment

code LGTM

@DaanHoogland
Contributor

@shwstppr I can see the managed instances but not the unmanaged ones:
[screenshot]
I had unmanaged VMs i-2-7-VM and i-2-11-VM, and they do not show up in the UI:
[screenshot]

@blueorangutan

Trillian test result (tid-3049)
Environment: vmware-67u3 (x2), Advanced Networking with Mgmt server 7
Total time taken: 38426 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr5400-t3049-vmware-67u3.zip
Smoke tests completed. 91 look OK, 1 have errors
Only failed tests results shown below:

| Test | Result | Time (s) | Test File |
| --- | --- | --- | --- |
| test_01_unmanage_vm_cycle | Error | 41.11 | test_vm_life_cycle.py |

@shwstppr shwstppr marked this pull request as draft January 31, 2022 03:16
@shwstppr
Contributor Author

Moving to draft while I retest and fix the issue

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
@shwstppr
Contributor Author

@blueorangutan package

@blueorangutan

@shwstppr a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan

Packaging result: ✔️ el7 ✔️ el8 ✔️ debian ✔️ suse15. SL-JID 2393

@shwstppr
Contributor Author

@DaanHoogland can you please give it a try now

@shwstppr shwstppr marked this pull request as ready for review January 31, 2022 05:53
@sureshanaparti
Contributor

@blueorangutan test centos7 vmware-67u3 keepEnv

@blueorangutan

@sureshanaparti a Trillian-Jenkins test job (centos7 mgmt + vmware-67u3) has been kicked to run smoke tests

Signed-off-by: Abhishek Kumar <abhishek.mrt22@gmail.com>
@shwstppr
Contributor Author

@blueorangutan package

@blueorangutan

@shwstppr a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan

Packaging result: ✔️ el7 ✔️ el8 ✔️ debian ✔️ suse15. SL-JID 2404

@DaanHoogland DaanHoogland self-assigned this Jan 31, 2022
@apache apache deleted a comment from blueorangutan Jan 31, 2022
@apache apache deleted a comment from blueorangutan Jan 31, 2022
@apache apache deleted a comment from blueorangutan Feb 1, 2022
@apache apache deleted a comment from blueorangutan Feb 2, 2022
@apache apache deleted a comment from blueorangutan Feb 2, 2022
@DaanHoogland
Contributor

good work @shwstppr , I now see both lists populated. I'll restart a set of smoke tests.
@blueorangutan test centos7 vmware-67u3

@blueorangutan

@DaanHoogland a Trillian-Jenkins test job (centos7 mgmt + vmware-67u3) has been kicked to run smoke tests

Contributor

@DaanHoogland DaanHoogland left a comment

clgtm
manually tested

@blueorangutan

Trillian test result (tid-3119)
Environment: vmware-67u3 (x2), Advanced Networking with Mgmt server 7
Total time taken: 47882 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr5400-t3119-vmware-67u3.zip
Smoke tests completed. 91 look OK, 1 have errors
Only failed tests results shown below:

| Test | Result | Time (s) | Test File |
| --- | --- | --- | --- |
| test_01_invalid_upgrade_kubernetes_cluster | Failure | 3612.37 | test_kubernetes_clusters.py |
| test_02_upgrade_kubernetes_cluster | Failure | 3604.26 | test_kubernetes_clusters.py |
| test_03_deploy_and_scale_kubernetes_cluster | Failure | 0.04 | test_kubernetes_clusters.py |
| test_04_autoscale_kubernetes_cluster | Failure | 0.03 | test_kubernetes_clusters.py |
| test_05_basic_lifecycle_kubernetes_cluster | Failure | 0.04 | test_kubernetes_clusters.py |
| test_06_delete_kubernetes_cluster | Failure | 0.04 | test_kubernetes_clusters.py |
| test_07_deploy_kubernetes_ha_cluster | Failure | 0.05 | test_kubernetes_clusters.py |
| test_08_upgrade_kubernetes_ha_cluster | Failure | 0.04 | test_kubernetes_clusters.py |
| test_09_delete_kubernetes_ha_cluster | Failure | 0.04 | test_kubernetes_clusters.py |
| ContextSuite context=TestKubernetesCluster>:teardown | Error | 40.91 | test_kubernetes_clusters.py |

@sureshanaparti
Contributor

Trillian test result (tid-3119)
Environment: vmware-67u3 (x2), Advanced Networking with Mgmt server 7
Total time taken: 47882 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr5400-t3119-vmware-67u3.zip
Smoke tests completed. 91 look OK, 1 have errors
Only failed tests results shown below:

| Test | Result | Time (s) | Test File |
| --- | --- | --- | --- |
| test_01_invalid_upgrade_kubernetes_cluster | Failure | 3612.37 | test_kubernetes_clusters.py |
| test_02_upgrade_kubernetes_cluster | Failure | 3604.26 | test_kubernetes_clusters.py |
| test_03_deploy_and_scale_kubernetes_cluster | Failure | 0.04 | test_kubernetes_clusters.py |
| test_04_autoscale_kubernetes_cluster | Failure | 0.03 | test_kubernetes_clusters.py |
| test_05_basic_lifecycle_kubernetes_cluster | Failure | 0.04 | test_kubernetes_clusters.py |
| test_06_delete_kubernetes_cluster | Failure | 0.04 | test_kubernetes_clusters.py |
| test_07_deploy_kubernetes_ha_cluster | Failure | 0.05 | test_kubernetes_clusters.py |
| test_08_upgrade_kubernetes_ha_cluster | Failure | 0.04 | test_kubernetes_clusters.py |
| test_09_delete_kubernetes_ha_cluster | Failure | 0.04 | test_kubernetes_clusters.py |
| ContextSuite context=TestKubernetesCluster>:teardown | Error | 40.91 | test_kubernetes_clusters.py |

these failures ^^^ are unrelated to the changes in this PR.

@sureshanaparti sureshanaparti merged commit 638779c into apache:4.16 Feb 3, 2022